CLAIMS 

What is claimed is: 

1. A machine-implemented method comprising:
extracting portions from time-domain speech segments; 

creating feature vectors that represent the portions in a vector space, the feature 
vectors incorporating phase information of the portions; and 

determining a distance between the feature vectors in the vector space. 

2. The machine-implemented method of claim 1, wherein creating feature vectors
comprises:

constructing a matrix W from the portions; and
decomposing the matrix W.

3. The machine-implemented method of claim 2, wherein decomposing the matrix
W comprises extracting global boundary-centric features from the portions.

4. The machine-implemented method of claim 2, wherein the speech segments each 
include a segment boundary within a phoneme. 

5. The machine-implemented method of claim 4, wherein the speech segments each 
include at least one diphone.

6. The machine-implemented method of claim 5, wherein the portions include at
least one pitch period. 

7. The machine-implemented method of claim 6, wherein decomposing the matrix
W comprises performing a pitch synchronous singular value analysis on the pitch periods
of the time-domain segments.

8. The machine-implemented method of claim 6, wherein the matrix W is a $2KM \times N$
matrix represented by

$$W = U \Sigma V^T$$

where K is the number of pitch periods near the segment boundary extracted from each
segment, N is the maximum number of samples among the pitch periods, M is the
number of segments in the voice table having a segment boundary within the phoneme,
U is the $2KM \times R$ left singular matrix with row vectors $u_i$ ($1 \leq i \leq 2KM$), $\Sigma$ is the $R \times R$
diagonal matrix of singular values $s_1 \geq s_2 \geq \ldots \geq s_R > 0$, V is the $N \times R$ right singular
matrix with row vectors $v_j$ ($1 \leq j \leq N$), $R \ll 2KM$, and T denotes matrix transposition,
wherein decomposing the matrix W comprises performing a singular value
decomposition of W.
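
As an illustration only, the construction recited in claim 8 might be sketched as follows. This is a minimal sketch assuming NumPy; the names `build_w`, `decompose`, `periods`, and `r` are hypothetical and do not come from the claims. Each pitch period is stored as one row of W, rows shorter than N are filled with trailing zeros, and the singular value decomposition is truncated to R components.

```python
import numpy as np

def build_w(periods):
    """Stack pitch periods as rows of W, zero padding each row to N samples."""
    n = max(len(p) for p in periods)          # N: longest pitch period
    w = np.zeros((len(periods), n))
    for i, p in enumerate(periods):
        w[i, :len(p)] = p                     # trailing zeros pad the row to N
    return w

def decompose(w, r):
    """Rank-r SVD: W ~= U diag(s) V^T, singular values in decreasing order."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    return u[:, :r], s[:r], vt[:r].T          # U (rows x r), s (r,), V (N x r)

# Hypothetical toy data standing in for 2KM = 4 pitch periods of unequal length.
periods = [np.array([1.0, 2.0, 3.0]),
           np.array([1.0, 2.0]),
           np.array([0.5, -1.0, 2.0, 1.0]),
           np.array([2.0, 1.0, 0.0, -1.0])]
W = build_w(periods)
U, s, V = decompose(W, r=2)
```

Here `W` has one row per pitch period, `U` holds the row vectors $u_i$, and `s` holds the diagonal of $\Sigma$. The choice of `r` is an assumption of this sketch; the claim only requires $R \ll 2KM$.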

9. The machine-implemented method of claim 8, wherein the pitch periods are zero 
padded to N samples.

10. The machine-implemented method of claim 9, wherein a feature vector $\bar{u}_i$ is
calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a pitch period i, and $\Sigma$ is the singular diagonal
matrix.

11. The machine-implemented method of claim 10, wherein the distance between
two feature vectors is determined by a metric comprising the cosine of the angle between
the two feature vectors.

12. The machine-implemented method of claim 11, wherein the metric comprises a
closeness measure, C, between two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \leq k, l \leq 2KM$.
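
The closeness measure of claims 10 to 12 might be computed as in the following minimal sketch, assuming NumPy. The function name `closeness` and the random stand-in data are hypothetical; only the formula itself is taken from the claims.

```python
import numpy as np

def closeness(u, s, k, l):
    """Cosine of the angle between the feature vectors u_k*Sigma and u_l*Sigma."""
    fk = u[k] * s          # row vector u_k scaled by the singular values
    fl = u[l] * s
    # fk @ fl equals u_k Sigma^2 u_l^T because Sigma is diagonal
    return float(fk @ fl / (np.linalg.norm(fk) * np.linalg.norm(fl)))

rng = np.random.default_rng(0)            # hypothetical stand-in for pitch-period data
W = rng.standard_normal((6, 8))
U, s, Vt = np.linalg.svd(W, full_matrices=False)
U, s = U[:, :3], s[:3]                    # keep R = 3 singular values
c = closeness(U, s, 0, 1)
```

The value `c` lies between -1 and 1, and the closeness of a feature vector with itself is 1.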

13. The machine-implemented method of claim 12, wherein a difference $d(S_1, S_2)$
between two segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = d_0(p_1, q_1) = 1 - C(\bar{u}_{p_1}, \bar{u}_{q_1})$$

where $d_0$ is the distance between pitch periods $p_1$ and $q_1$, $p_1$ is the last pitch period of $S_1$,
$q_1$ is the first pitch period of $S_2$, $\bar{u}_{p_1}$ is a feature vector associated with pitch period $p_1$,
and $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$.

14. The machine-implemented method of claim 13, wherein the calculation for the
difference between two segments in the voice table, $S_1$ and $S_2$, is expanded to include a
plurality of pitch periods from each segment.

15. The machine-implemented method of claim 13, wherein the difference between
two segments in the voice table, $S_1$ and $S_2$, is associated with a discontinuity between $S_1$
and $S_2$.

16. The machine-implemented method of claim 12, wherein a difference $d(S_1, S_2)$
between two segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = \left| d_0(p_1, q_1) - \frac{d_0(p_1, \bar{p}_1) + d_0(q_1, \bar{q}_1)}{2} \right| = \left| \frac{C(\bar{u}_{p_1}, \bar{u}_{\bar{p}_1}) + C(\bar{u}_{q_1}, \bar{u}_{\bar{q}_1})}{2} - C(\bar{u}_{p_1}, \bar{u}_{q_1}) \right|$$

where $d_0$ is the distance between pitch periods, $p_1$ is the last pitch period of $S_1$, $\bar{p}_1$ is the
first pitch period of a segment contiguous to $S_1$, $q_1$ is the first pitch period of $S_2$, $\bar{q}_1$ is
the last pitch period of a segment contiguous to $S_2$, $\bar{u}_{p_1}$ is a feature vector associated
with pitch period $p_1$, $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$, $\bar{u}_{\bar{p}_1}$ is a
feature vector associated with pitch period $\bar{p}_1$, and $\bar{u}_{\bar{q}_1}$ is a feature vector associated
with pitch period $\bar{q}_1$.
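
The two difference measures of claims 13 and 16 can be compared side by side in a minimal sketch, assuming NumPy and feature vectors that have already been computed. The function names (`cos_sim`, `d0`, `d_simple`, `d_relative`) and the two-dimensional toy vectors are hypothetical.

```python
import numpy as np

def cos_sim(a, b):
    """Closeness C: cosine of the angle between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def d0(a, b):
    """Distance between two pitch periods, d0 = 1 - C."""
    return 1.0 - cos_sim(a, b)

def d_simple(f_p1, f_q1):
    """Claim 13 form: distance across the candidate boundary only."""
    return d0(f_p1, f_q1)

def d_relative(f_p1, f_p1_bar, f_q1, f_q1_bar):
    """Claim 16 form: boundary distance minus the average of the two
    distances to the contiguous pitch periods, in absolute value."""
    return abs(d0(f_p1, f_q1) - (d0(f_p1, f_p1_bar) + d0(f_q1, f_q1_bar)) / 2.0)

# Hypothetical toy feature vectors: S1 ends along one axis, S2 starts along the other.
p1, p1_bar = np.array([1.0, 0.0]), np.array([1.0, 0.0])
q1, q1_bar = np.array([0.0, 1.0]), np.array([0.0, 1.0])
simple = d_simple(p1, q1)
relative = d_relative(p1, p1_bar, q1, q1_bar)
```

In this toy case the feature vectors on either side of the boundary are orthogonal while each matches its contiguous pitch period, so both measures evaluate to 1.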

17. The machine-implemented method of claim 2, further comprising associating the 
distance between the feature vectors with speech segments in the voice table.

18. The machine-implemented method of claim 17, further comprising:

selecting speech segments from the voice table based on the distance between the 
feature vectors. 
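
One way the selection of claims 17 and 18 could look is sketched below. Everything in it is an assumption made for illustration: the claims say only that segments are selected based on the distance, so choosing the smallest stored distance, the dictionary layout, and the segment identifiers are all hypothetical.

```python
def select_next(prev_id, candidate_ids, distance):
    """Pick the candidate whose stored distance to prev_id is smallest (assumed policy)."""
    return min(candidate_ids, key=lambda c: distance[(prev_id, c)])

# Hypothetical precomputed distances associated with segment pairs in a voice table.
distance = {
    ("a-b#1", "b-c#1"): 0.42,
    ("a-b#1", "b-c#2"): 0.07,
    ("a-b#1", "b-c#3"): 0.31,
}
best = select_next("a-b#1", ["b-c#1", "b-c#2", "b-c#3"], distance)
```

With these made-up values, `best` is `"b-c#2"`, the candidate with the lowest stored distance to the preceding segment.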

19. The machine-implemented method of claim 5, wherein the portions include 
centered pitch periods. 

20. The machine-implemented method of claim 19, wherein the matrix W is a
$(2(K-1)+1)M \times N$ matrix represented by

$$W = U \Sigma V^T$$

where $K-1$ is the number of centered pitch periods near the segment boundary extracted
from each segment, N is the maximum number of samples among the centered pitch
periods, M is the number of segments in the voice table having a segment boundary
within the phoneme, U is the $(2(K-1)+1)M \times R$ left singular matrix with row vectors
$u_i$ ($1 \leq i \leq (2(K-1)+1)M$), $\Sigma$ is the $R \times R$ diagonal matrix of singular values
$s_1 \geq s_2 \geq \ldots \geq s_R > 0$, V is the $N \times R$ right singular matrix with row vectors $v_j$ ($1 \leq j \leq N$),
$R \ll (2(K-1)+1)M$, and T denotes matrix transposition, wherein decomposing the matrix W
comprises performing a singular value decomposition of W.

21. The machine-implemented method of claim 20, wherein the centered pitch
periods are symmetrically zero padded to N samples.

22. The machine-implemented method of claim 21, wherein a feature vector $\bar{u}_i$ is
calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a centered pitch period i, and $\Sigma$ is the singular
diagonal matrix.
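
The symmetric zero padding of claim 21 and the feature vectors of claim 22 might be sketched as follows, assuming NumPy. The names `pad_centered` and `centered_features`, the rule for placing an odd leftover sample, and the toy data are assumptions of this sketch rather than anything stated in the claims.

```python
import numpy as np

def pad_centered(period, n):
    """Zero pad on both sides so the period stays centered within n samples."""
    extra = n - len(period)
    if extra < 0:
        raise ValueError("period is longer than n")
    left = extra // 2                     # an odd leftover sample goes on the right (assumption)
    return np.pad(period, (left, extra - left))

def centered_features(periods, r):
    """Rows of the result are the feature vectors u_i * Sigma, truncated to rank r."""
    n = max(len(p) for p in periods)      # N: longest centered pitch period
    w = np.vstack([pad_centered(p, n) for p in periods])
    u, s, _ = np.linalg.svd(w, full_matrices=False)
    return u[:, :r] * s[:r]               # scale each column of U by its singular value

padded = pad_centered(np.array([1.0, 2.0, 3.0]), 7)   # [0, 0, 1, 2, 3, 0, 0]
feats = centered_features([np.array([1.0, 2.0, 3.0]),
                           np.array([2.0, 1.0, 0.0, -1.0, 0.5]),
                           np.array([0.5, 1.5])], r=2)
```

Padding on both sides keeps the center of each pitch period aligned across the rows of W, which trailing-only padding would not do.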

23. The machine-implemented method of claim 22, wherein the distance between
two feature vectors is determined by a metric comprising a closeness measure, C,
between two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \leq k, l \leq (2(K-1)+1)M$.

24. The machine-implemented method of claim 23, wherein a difference $d(S_1, S_2)$
between two segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = C(\bar{u}_{\pi_{-1}}, \bar{u}_{\delta_0}) + C(\bar{u}_{\delta_0}, \bar{u}_{\sigma_1}) - C(\bar{u}_{\pi_{-1}}, \bar{u}_{\pi_0}) - C(\bar{u}_{\sigma_0}, \bar{u}_{\sigma_1})$$

where $\bar{u}_{\pi_{-1}}$ is a feature vector associated with a centered pitch period $\pi_{-1}$, $\bar{u}_{\delta_0}$ is a
feature vector associated with a centered pitch period $\delta_0$, $\bar{u}_{\sigma_1}$ is a feature vector
associated with a centered pitch period $\sigma_1$, $\bar{u}_{\pi_0}$ is a feature vector associated with a
centered pitch period $\pi_0$, and $\bar{u}_{\sigma_0}$ is a feature vector associated with a centered pitch
period $\sigma_0$.
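
The four-term difference of claim 24 can be written directly from its formula. The following is a minimal sketch assuming NumPy and precomputed feature vectors; the function and argument names are hypothetical, and the sketch takes no position on how the centered pitch periods themselves are obtained.

```python
import numpy as np

def cos_sim(a, b):
    """Closeness C: cosine of the angle between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def d_centered(f_pi_m1, f_pi_0, f_delta_0, f_sigma_0, f_sigma_1):
    """d = C(pi_-1, delta_0) + C(delta_0, sigma_1) - C(pi_-1, pi_0) - C(sigma_0, sigma_1)."""
    return (cos_sim(f_pi_m1, f_delta_0) + cos_sim(f_delta_0, f_sigma_1)
            - cos_sim(f_pi_m1, f_pi_0) - cos_sim(f_sigma_0, f_sigma_1))

# Hypothetical toy vectors: when delta_0, pi_0, and sigma_0 share one feature vector,
# the four terms cancel in pairs and the difference is zero.
a, b, c = np.array([1.0, 0.5]), np.array([0.8, 0.9]), np.array([0.2, 1.0])
zero = d_centered(a, b, b, b, c)
```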

25. A machine-readable medium having instructions to cause a machine to perform a
machine-implemented method comprising: 

extracting portions from time-domain speech segments; 
creating feature vectors that represent the portions in a vector space, the feature 
vectors incorporating phase information of the portions; and 

determining a distance between the feature vectors in the vector space. 

26. The machine-readable medium of claim 25, wherein creating feature vectors
comprises:

constructing a matrix W from the portions; and
decomposing the matrix W.

27. The machine-readable medium of claim 26, wherein decomposing the matrix W 
comprises extracting global boundary-centric features from the portions. 

28. The machine-readable medium of claim 26, wherein the speech segments each 
include a segment boundary within a phoneme. 

29. The machine-readable medium of claim 28, wherein the speech segments each 
include at least one diphone. 

30. The machine-readable medium of claim 29, wherein the portions include at least 
one pitch period.

31. The machine-readable medium of claim 30, wherein decomposing the matrix W
comprises performing a pitch synchronous singular value analysis on the pitch periods of 
the time-domain segments. 

32. The machine-readable medium of claim 30, wherein the matrix W is a $2KM \times N$
matrix represented by

$$W = U \Sigma V^T$$

where K is the number of pitch periods near the segment boundary extracted from each
segment, N is the maximum number of samples among the pitch periods, M is the
number of segments in the voice table having a segment boundary within the phoneme,
U is the $2KM \times R$ left singular matrix with row vectors $u_i$ ($1 \leq i \leq 2KM$), $\Sigma$ is the $R \times R$
diagonal matrix of singular values $s_1 \geq s_2 \geq \ldots \geq s_R > 0$, V is the $N \times R$ right singular
matrix with row vectors $v_j$ ($1 \leq j \leq N$), $R \ll 2KM$, and T denotes matrix transposition,
wherein decomposing the matrix W comprises performing a singular value
decomposition of W.

33. The machine-readable medium of claim 32, wherein the pitch periods are zero 
padded to N samples. 

34. The machine-readable medium of claim 33, wherein a feature vector $\bar{u}_i$ is
calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a pitch period i, and $\Sigma$ is the singular diagonal
matrix.

35. The machine-readable medium of claim 34, wherein the distance between two 
feature vectors is determined by a metric comprising the cosine of the angle between the 
two feature vectors. 

36. The machine-readable medium of claim 35, wherein the metric comprises a
closeness measure, C, between two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \leq k, l \leq 2KM$.

37. The machine-readable medium of claim 36, wherein a difference $d(S_1, S_2)$
between two segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = d_0(p_1, q_1) = 1 - C(\bar{u}_{p_1}, \bar{u}_{q_1})$$

where $d_0$ is the distance between pitch periods $p_1$ and $q_1$, $p_1$ is the last pitch period of $S_1$,
$q_1$ is the first pitch period of $S_2$, $\bar{u}_{p_1}$ is a feature vector associated with pitch period $p_1$,
and $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$.

38. The machine-readable medium of claim 37, wherein the calculation for the
difference between two segments in the voice table, $S_1$ and $S_2$, is expanded to include a
plurality of pitch periods from each segment.

39. The machine-readable medium of claim 37, wherein the difference between two
segments in the voice table, $S_1$ and $S_2$, is associated with a discontinuity between $S_1$ and
$S_2$.

40. The machine-readable medium of claim 36, wherein a difference $d(S_1, S_2)$
between two segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = \left| d_0(p_1, q_1) - \frac{d_0(p_1, \bar{p}_1) + d_0(q_1, \bar{q}_1)}{2} \right| = \left| \frac{C(\bar{u}_{p_1}, \bar{u}_{\bar{p}_1}) + C(\bar{u}_{q_1}, \bar{u}_{\bar{q}_1})}{2} - C(\bar{u}_{p_1}, \bar{u}_{q_1}) \right|$$

where $d_0$ is the distance between pitch periods, $p_1$ is the last pitch period of $S_1$, $\bar{p}_1$ is the
first pitch period of a segment contiguous to $S_1$, $q_1$ is the first pitch period of $S_2$, $\bar{q}_1$ is
the last pitch period of a segment contiguous to $S_2$, $\bar{u}_{p_1}$ is a feature vector associated
with pitch period $p_1$, $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$, $\bar{u}_{\bar{p}_1}$ is a
feature vector associated with pitch period $\bar{p}_1$, and $\bar{u}_{\bar{q}_1}$ is a feature vector associated
with pitch period $\bar{q}_1$.

41. The machine-readable medium of claim 26, wherein the method further
comprises associating the distance between the feature vectors with speech segments in
the voice table.

42. The machine-readable medium of claim 41, wherein the method further
comprises: 

selecting speech segments from the voice table based on the distance between the 
feature vectors. 

43. The machine-readable medium of claim 29, wherein the portions include
centered pitch periods. 

44. The machine-readable medium of claim 43, wherein the matrix W is a
$(2(K-1)+1)M \times N$ matrix represented by

$$W = U \Sigma V^T$$

where $K-1$ is the number of centered pitch periods near the segment boundary extracted
from each segment, N is the maximum number of samples among the centered pitch
periods, M is the number of segments in the voice table having a segment boundary
within the phoneme, U is the $(2(K-1)+1)M \times R$ left singular matrix with row vectors
$u_i$ ($1 \leq i \leq (2(K-1)+1)M$), $\Sigma$ is the $R \times R$ diagonal matrix of singular values
$s_1 \geq s_2 \geq \ldots \geq s_R > 0$, V is the $N \times R$ right singular matrix with row vectors $v_j$ ($1 \leq j \leq N$),
$R \ll (2(K-1)+1)M$, and T denotes matrix transposition, wherein decomposing the matrix W
comprises performing a singular value decomposition of W.

45. The machine-readable medium of claim 44, wherein the centered pitch periods
are symmetrically zero padded to N samples.

46. The machine-readable medium of claim 45, wherein a feature vector $\bar{u}_i$ is
calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a centered pitch period i, and $\Sigma$ is the singular
diagonal matrix.

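47. The machine-readable medium of claim 46, wherein the distance between two
feature vectors is determined by a metric comprising a closeness measure, C, between
two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \leq k, l \leq (2(K-1)+1)M$.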

48. The machine-readable medium of claim 47, wherein a difference $d(S_1, S_2)$
between two segments in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = C(\bar{u}_{\pi_{-1}}, \bar{u}_{\delta_0}) + C(\bar{u}_{\delta_0}, \bar{u}_{\sigma_1}) - C(\bar{u}_{\pi_{-1}}, \bar{u}_{\pi_0}) - C(\bar{u}_{\sigma_0}, \bar{u}_{\sigma_1})$$

where $\bar{u}_{\pi_{-1}}$ is a feature vector associated with a centered pitch period $\pi_{-1}$, $\bar{u}_{\delta_0}$ is a
feature vector associated with a centered pitch period $\delta_0$, $\bar{u}_{\sigma_1}$ is a feature vector
associated with a centered pitch period $\sigma_1$, $\bar{u}_{\pi_0}$ is a feature vector associated with a
centered pitch period $\pi_0$, and $\bar{u}_{\sigma_0}$ is a feature vector associated with a centered pitch
period $\sigma_0$.

49. An apparatus comprising:

means for extracting portions from time-domain speech segments; 
means for creating feature vectors that represent the portions in a vector space, 
the feature vectors incorporating phase information of the portions; and 

means for determining a distance between the feature vectors in the vector space. 

50. The apparatus of claim 49, wherein creating feature vectors comprises:

means for constructing a matrix W from the portions; and
means for decomposing the matrix W.

51. The apparatus of claim 50, wherein the means for decomposing the matrix W
comprises means for extracting global boundary-centric features from the portions. 

52. The apparatus of claim 50, wherein the speech segments each include a segment 
boundary within a phoneme. 

53. The apparatus of claim 52, wherein the speech segments each include at least one 
diphone. 

54. The apparatus of claim 53, wherein the portions include at least one pitch period.

55. The apparatus of claim 54, wherein the means for decomposing the matrix W
comprises means for performing a pitch synchronous singular value analysis on the pitch 
periods of the time-domain segments. 

56. The apparatus of claim 54, wherein the matrix W is a $2KM \times N$ matrix
represented by

$$W = U \Sigma V^T$$

where K is the number of pitch periods near the segment boundary extracted from each
segment, N is the maximum number of samples among the pitch periods, M is the
number of segments in the voice table having a segment boundary within the phoneme,
U is the $2KM \times R$ left singular matrix with row vectors $u_i$ ($1 \leq i \leq 2KM$), $\Sigma$ is the $R \times R$
diagonal matrix of singular values $s_1 \geq s_2 \geq \ldots \geq s_R > 0$, V is the $N \times R$ right singular
matrix with row vectors $v_j$ ($1 \leq j \leq N$), $R \ll 2KM$, and T denotes matrix transposition,
wherein decomposing the matrix W comprises performing a singular value
decomposition of W.

57. The apparatus of claim 56, wherein the pitch periods are zero padded to N 
samples. 

58. The apparatus of claim 57, wherein a feature vector $\bar{u}_i$ is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a pitch period i, and $\Sigma$ is the singular diagonal
matrix.

59. The apparatus of claim 58, wherein the distance between two feature vectors is
determined by a metric comprising the cosine of the angle between the two feature 
vectors. 

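60. The apparatus of claim 59, wherein the metric comprises a closeness measure, C,
between two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \leq k, l \leq 2KM$.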

61. The apparatus of claim 60, wherein a difference $d(S_1, S_2)$ between two segments
in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = d_0(p_1, q_1) = 1 - C(\bar{u}_{p_1}, \bar{u}_{q_1})$$

where $d_0$ is the distance between pitch periods $p_1$ and $q_1$, $p_1$ is the last pitch period of $S_1$,
$q_1$ is the first pitch period of $S_2$, $\bar{u}_{p_1}$ is a feature vector associated with pitch period $p_1$,
and $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$.

62. The apparatus of claim 61, wherein the calculation for the difference between
two segments in the voice table, $S_1$ and $S_2$, is expanded to include a plurality of pitch
periods from each segment.

63. The apparatus of claim 61, wherein the difference between two segments in the
voice table, $S_1$ and $S_2$, is associated with a discontinuity between $S_1$ and $S_2$.

64. The apparatus of claim 60, wherein a difference $d(S_1, S_2)$ between two segments
in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = \left| d_0(p_1, q_1) - \frac{d_0(p_1, \bar{p}_1) + d_0(q_1, \bar{q}_1)}{2} \right| = \left| \frac{C(\bar{u}_{p_1}, \bar{u}_{\bar{p}_1}) + C(\bar{u}_{q_1}, \bar{u}_{\bar{q}_1})}{2} - C(\bar{u}_{p_1}, \bar{u}_{q_1}) \right|$$

where $d_0$ is the distance between pitch periods, $p_1$ is the last pitch period of $S_1$, $\bar{p}_1$ is the
first pitch period of a segment contiguous to $S_1$, $q_1$ is the first pitch period of $S_2$, $\bar{q}_1$ is
the last pitch period of a segment contiguous to $S_2$, $\bar{u}_{p_1}$ is a feature vector associated
with pitch period $p_1$, $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$, $\bar{u}_{\bar{p}_1}$ is a
feature vector associated with pitch period $\bar{p}_1$, and $\bar{u}_{\bar{q}_1}$ is a feature vector associated
with pitch period $\bar{q}_1$.

65. The apparatus of claim 50, further comprising means for associating the distance 
between the feature vectors with speech segments in the voice table. 

66. The apparatus of claim 65, further comprising: 

means for selecting speech segments from the voice table based on the distance 
between the feature vectors. 

67. The apparatus of claim 53, wherein the portions include centered pitch periods.

68. The apparatus of claim 67, wherein the matrix W is a $(2(K-1)+1)M \times N$ matrix
represented by

$$W = U \Sigma V^T$$

where $K-1$ is the number of centered pitch periods near the segment boundary extracted
from each segment, N is the maximum number of samples among the centered pitch
periods, M is the number of segments in the voice table having a segment boundary
within the phoneme, U is the $(2(K-1)+1)M \times R$ left singular matrix with row vectors
$u_i$ ($1 \leq i \leq (2(K-1)+1)M$), $\Sigma$ is the $R \times R$ diagonal matrix of singular values
$s_1 \geq s_2 \geq \ldots \geq s_R > 0$, V is the $N \times R$ right singular matrix with row vectors $v_j$ ($1 \leq j \leq N$),
$R \ll (2(K-1)+1)M$, and T denotes matrix transposition, wherein decomposing the matrix W
comprises performing a singular value decomposition of W.

69. The apparatus of claim 68, wherein the centered pitch periods are symmetrically 
zero padded to N samples. 

70. The apparatus of claim 69, wherein a feature vector $\bar{u}_i$ is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a centered pitch period i, and $\Sigma$ is the singular
diagonal matrix.

71. The apparatus of claim 70, wherein the distance between two feature vectors is
determined by a metric comprising a closeness measure, C, between two feature vectors,
$\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \leq k, l \leq (2(K-1)+1)M$.

72. The apparatus of claim 71, wherein a difference $d(S_1, S_2)$ between two segments
in the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = C(\bar{u}_{\pi_{-1}}, \bar{u}_{\delta_0}) + C(\bar{u}_{\delta_0}, \bar{u}_{\sigma_1}) - C(\bar{u}_{\pi_{-1}}, \bar{u}_{\pi_0}) - C(\bar{u}_{\sigma_0}, \bar{u}_{\sigma_1})$$

where $\bar{u}_{\pi_{-1}}$ is a feature vector associated with a centered pitch period $\pi_{-1}$, $\bar{u}_{\delta_0}$ is a
feature vector associated with a centered pitch period $\delta_0$, $\bar{u}_{\sigma_1}$ is a feature vector
associated with a centered pitch period $\sigma_1$, $\bar{u}_{\pi_0}$ is a feature vector associated with a
centered pitch period $\pi_0$, and $\bar{u}_{\sigma_0}$ is a feature vector associated with a centered pitch
period $\sigma_0$.

73. A system comprising: 

a processing unit coupled to a memory through a bus; and 
a process executed from the memory by the processing unit to cause the 
processing unit to extract portions from time-domain speech segments, create feature 
vectors that represent the portions in a vector space, the feature vectors incorporating 
phase information of the portions, and determine a distance between the feature vectors 
in the vector space.

74. The system of claim 73, wherein the process further causes the processing unit,
when creating feature vectors, to construct a matrix W from the portions, and decompose
the matrix W.

75. The system of claim 74, wherein the process further causes the processing unit,
when decomposing the matrix W, to extract global boundary-centric features from the
portions.

76. The system of claim 74, wherein the speech segments each include a segment 
boundary within a phoneme. 

77. The system of claim 76, wherein the speech segments each include at least one 
diphone. 

78. The system of claim 77, wherein the portions include at least one pitch period. 

79. The system of claim 78, wherein the process further causes the processing unit,
when decomposing the matrix W, to perform a pitch synchronous singular value analysis
on the pitch periods of the time-domain segments.

80. The system of claim 78, wherein the matrix W is a $2KM \times N$ matrix represented
by

$$W = U \Sigma V^T$$

where K is the number of pitch periods near the segment boundary extracted from each
segment, N is the maximum number of samples among the pitch periods, M is the
number of segments in the voice table having a segment boundary within the phoneme,
U is the $2KM \times R$ left singular matrix with row vectors $u_i$ ($1 \leq i \leq 2KM$), $\Sigma$ is the $R \times R$
diagonal matrix of singular values $s_1 \geq s_2 \geq \ldots \geq s_R > 0$, V is the $N \times R$ right singular
matrix with row vectors $v_j$ ($1 \leq j \leq N$), $R \ll 2KM$, and T denotes matrix transposition,
wherein decomposing the matrix W comprises performing a singular value
decomposition of W.

81. The system of claim 80, wherein the pitch periods are zero padded to N samples.

82. The system of claim 81, wherein a feature vector $\bar{u}_i$ is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a pitch period i, and $\Sigma$ is the singular diagonal
matrix.

83. The system of claim 82, wherein the distance between two feature vectors is 
determined by a metric comprising the cosine of the angle between the two feature 
vectors.

84. The system of claim 83, wherein the metric comprises a closeness measure, C,
between two feature vectors, $\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \leq k, l \leq 2KM$.

85. The system of claim 84, wherein a difference $d(S_1, S_2)$ between two segments in
the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = d_0(p_1, q_1) = 1 - C(\bar{u}_{p_1}, \bar{u}_{q_1})$$

where $d_0$ is the distance between pitch periods $p_1$ and $q_1$, $p_1$ is the last pitch period of $S_1$,
$q_1$ is the first pitch period of $S_2$, $\bar{u}_{p_1}$ is a feature vector associated with pitch period $p_1$,
and $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$.

86. The system of claim 85, wherein the calculation for the difference between two
segments in the voice table, $S_1$ and $S_2$, is expanded to include a plurality of pitch periods
from each segment.

87. The system of claim 85, wherein the difference between two segments in the
voice table, $S_1$ and $S_2$, is associated with a discontinuity between $S_1$ and $S_2$.

88. The system of claim 84, wherein a difference $d(S_1, S_2)$ between two segments in
the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = \left| d_0(p_1, q_1) - \frac{d_0(p_1, \bar{p}_1) + d_0(q_1, \bar{q}_1)}{2} \right| = \left| \frac{C(\bar{u}_{p_1}, \bar{u}_{\bar{p}_1}) + C(\bar{u}_{q_1}, \bar{u}_{\bar{q}_1})}{2} - C(\bar{u}_{p_1}, \bar{u}_{q_1}) \right|$$

where $d_0$ is the distance between pitch periods, $p_1$ is the last pitch period of $S_1$, $\bar{p}_1$ is the
first pitch period of a segment contiguous to $S_1$, $q_1$ is the first pitch period of $S_2$, $\bar{q}_1$ is
the last pitch period of a segment contiguous to $S_2$, $\bar{u}_{p_1}$ is a feature vector associated
with pitch period $p_1$, $\bar{u}_{q_1}$ is a feature vector associated with pitch period $q_1$, $\bar{u}_{\bar{p}_1}$ is a
feature vector associated with pitch period $\bar{p}_1$, and $\bar{u}_{\bar{q}_1}$ is a feature vector associated
with pitch period $\bar{q}_1$.

89. The system of claim 74, wherein the process further causes the processing unit to 
associate the distance between the feature vectors with speech segments in the voice 
table. 

90. The system of claim 89, wherein the process further causes the processing unit to 
select speech segments from the voice table based on the distance between the feature 
vectors. 

91. The system of claim 77, wherein the portions include centered pitch periods.

92. The system of claim 91, wherein the matrix W is a $(2(K-1)+1)M \times N$ matrix
represented by

$$W = U \Sigma V^T$$

where $K-1$ is the number of centered pitch periods near the segment boundary extracted
from each segment, N is the maximum number of samples among the centered pitch
periods, M is the number of segments in the voice table having a segment boundary
within the phoneme, U is the $(2(K-1)+1)M \times R$ left singular matrix with row vectors
$u_i$ ($1 \leq i \leq (2(K-1)+1)M$), $\Sigma$ is the $R \times R$ diagonal matrix of singular values
$s_1 \geq s_2 \geq \ldots \geq s_R > 0$, V is the $N \times R$ right singular matrix with row vectors $v_j$ ($1 \leq j \leq N$),
$R \ll (2(K-1)+1)M$, and T denotes matrix transposition, wherein decomposing the matrix W
comprises performing a singular value decomposition of W.

93. The system of claim 92, wherein the centered pitch periods are symmetrically
zero padded to N samples. 

94. The system of claim 93, wherein a feature vector $\bar{u}_i$ is calculated as

$$\bar{u}_i = u_i \Sigma$$

where $u_i$ is a row vector associated with a centered pitch period i, and $\Sigma$ is the singular
diagonal matrix.

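95. The system of claim 94, wherein the distance between two feature vectors is
determined by a metric comprising a closeness measure, C, between two feature vectors,
$\bar{u}_k$ and $\bar{u}_l$, wherein C is calculated as

$$C(\bar{u}_k, \bar{u}_l) = \cos(u_k \Sigma, u_l \Sigma) = \frac{u_k \Sigma^2 u_l^T}{\|u_k \Sigma\| \, \|u_l \Sigma\|}$$

for any $1 \leq k, l \leq (2(K-1)+1)M$.

Attorney Docket: 4860.P3128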

96. The system of claim 95, wherein a difference $d(S_1, S_2)$ between two segments in
the voice table, $S_1$ and $S_2$, is calculated as

$$d(S_1, S_2) = C(\bar{u}_{\pi_{-1}}, \bar{u}_{\delta_0}) + C(\bar{u}_{\delta_0}, \bar{u}_{\sigma_1}) - C(\bar{u}_{\pi_{-1}}, \bar{u}_{\pi_0}) - C(\bar{u}_{\sigma_0}, \bar{u}_{\sigma_1})$$

where $\bar{u}_{\pi_{-1}}$ is a feature vector associated with a centered pitch period $\pi_{-1}$, $\bar{u}_{\delta_0}$ is a
feature vector associated with a centered pitch period $\delta_0$, $\bar{u}_{\sigma_1}$ is a feature vector
associated with a centered pitch period $\sigma_1$, $\bar{u}_{\pi_0}$ is a feature vector associated with a
centered pitch period $\pi_0$, and $\bar{u}_{\sigma_0}$ is a feature vector associated with a centered pitch
period $\sigma_0$.

97. A machine-implemented method comprising: 

gathering time-domain samples from recorded speech segments; 
extracting features that represent the samples; 

determining a discontinuity between the segments, the discontinuity based on a 
distance between the features. 

98. The machine-implemented method of claim 97, wherein the time-domain 
samples include pitch periods surrounding a boundary of a phoneme.

99. The machine-implemented method of claim 98, wherein the features incorporate
phase information of the pitch periods.

100. The machine-implemented method of claim 99, wherein extracting features 
comprises constructing a matrix from the time-domain samples and decomposing the 
matrix. 

101. A machine-readable medium having instructions to cause a machine to perform a
machine-implemented method comprising: 

gathering time-domain samples from recorded speech segments; 
extracting features that represent the samples; 

determining a discontinuity between the segments, the discontinuity based on a 
distance between the features. 

102. The machine-readable medium of claim 101, wherein the time-domain samples
include pitch periods surrounding a boundary of a phoneme. 

103. The machine-readable medium of claim 102, wherein the features incorporate 
phase information of the pitch periods. 

104. The machine-readable medium of claim 103, wherein extracting features
comprises constructing a matrix from the time-domain samples and decomposing the
matrix.

105. An apparatus comprising:

means for gathering time-domain samples from recorded speech segments; 
means for extracting features that represent the samples; 
means for determining a discontinuity between the segments, the discontinuity 
based on a distance between the features. 

106. The apparatus of claim 105, wherein the time-domain samples include pitch
periods surrounding a boundary of a phoneme.

107. The apparatus of claim 106, wherein the features incorporate phase information
of the pitch periods.

108. The apparatus of claim 107, wherein the means for extracting features comprises
means for constructing a matrix from the time-domain samples and means for 
decomposing the matrix. 

109. A system comprising: 

a processing unit coupled to a memory through a bus; and 
a process executed from the memory by the processing unit to cause the processing unit 
to gather time-domain samples from recorded speech segments, extract features that 
represent the samples, and determine a discontinuity between the segments, the 
discontinuity based on a distance between the features.

110. The system of claim 109, wherein the time-domain samples include pitch periods
surrounding a boundary of a phoneme.

111. The system of claim 110, wherein the features incorporate phase information of
the pitch periods. 

112. The system of claim 111, wherein the process further causes the processing unit, 
when extracting features, to construct a matrix from the time-domain samples and 
decompose the matrix. 